Lexically Constrained Knowledge Distillation for Neural Machine Translation

Authors

Abstract

Knowledge distillation is a representative approach in neural machine translation (NMT) for compressing a large model into a lightweight one. It first trains a strong teacher model and then forces a more compact student model to imitate the teacher. Although the key to successful knowledge distillation is constructing a stronger teacher, even a teacher built with a state-of-the-art NMT architecture may remain inadequate owing to translation errors. Accordingly, the student severely degrades due to error propagation, especially regarding words important to the sentence meaning. To mitigate this degradation problem, we propose a method that uses lexical constraints as privileged information for NMT. The proposed method trains the teacher with lexical constraints, a list of words automatically extracted from the target side of the training data. We configure the constraints according to word importance and fallibility. Models trained with our method yield improved translations compared with those of the baseline on English↔German and English↔Japanese tasks, under conditions without ensemble decoding and with beam-search decoding.
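As a rough illustration of the distillation setup described above (a minimal sketch, not the paper's implementation), word-level knowledge distillation trains the student on the teacher's softened output distribution; the `weighted_kd_loss` variant, with hypothetical per-token weights for words on an extracted constraint list, is an assumption added here to suggest how lexically important tokens could be emphasized:

```python
import numpy as np

def _softmax(x):
    # Numerically stable row-wise softmax over the vocabulary axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, temperature=1.0):
    """Word-level KD: cross-entropy between the teacher's softened
    distribution and the student's, averaged over tokens."""
    t = _softmax(teacher_logits / temperature)
    log_s = np.log(_softmax(student_logits / temperature))
    return -(t * log_s).sum(axis=-1).mean()

def weighted_kd_loss(student_logits, teacher_logits, token_weights):
    """Hypothetical variant: up-weight tokens whose reference word is on
    an automatically extracted constraint list (an assumption for
    illustration, not the paper's exact formulation)."""
    t = _softmax(teacher_logits)
    log_s = np.log(_softmax(student_logits))
    per_token = -(t * log_s).sum(axis=-1)
    return (per_token * token_weights).sum() / token_weights.sum()
```

By Gibbs' inequality the cross-entropy is minimized when the student's distribution matches the teacher's, so imitating the teacher drives the loss toward the teacher's entropy.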


Similar articles

Ensemble Distillation for Neural Machine Translation

Knowledge distillation describes a method for training a student network to perform better by learning from a stronger teacher network. In this work, we run experiments with different kinds of teacher networks to enhance the translation performance of a student Neural Machine Translation (NMT) network. We demonstrate techniques based on an ensemble and a best BLEU teacher network. We also show ...
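As a rough sketch of the ensemble-teacher idea (an illustration under assumptions, not this paper's code), a common way to form a single distillation target from several teachers is to average their per-token output distributions:

```python
import numpy as np

def softmax(x):
    # Numerically stable row-wise softmax over the vocabulary axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def ensemble_teacher_probs(member_logits):
    """Average the per-token output distributions of several teacher
    models; the averaged distribution serves as the soft target the
    student is trained to imitate."""
    return np.mean([softmax(l) for l in member_logits], axis=0)
```

Averaging in probability space (rather than logit space) keeps the result a valid distribution and matches how NMT ensembles are typically combined at decoding time.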


Prior Knowledge Integration for Neural Machine Translation using Posterior Regularization

Although neural machine translation has made significant progress recently, how to integrate multiple overlapping, arbitrary prior knowledge sources remains a challenge. In this work, we propose to use posterior regularization to provide a general framework for integrating prior knowledge into neural machine translation. We represent prior knowledge sources as features in a log-linear model, wh...


Pre-Translation for Neural Machine Translation

Recently, the development of neural machine translation (NMT) has significantly improved the translation quality of automatic machine translation. While most sentences are more accurate and fluent than translations by statistical machine translation (SMT)-based systems, in some cases, the NMT system produces translations that have a completely different meaning. This is especially the case when...


Bilingually-constrained Phrase Embeddings for Machine Translation

We propose Bilingually-constrained Recursive Auto-encoders (BRAE) to learn semantic phrase embeddings (compact vector representations for phrases), which can distinguish the phrases with different semantic meanings. The BRAE is trained in a way that minimizes the semantic distance of translation equivalents and maximizes the semantic distance of nontranslation pairs simultaneously. After traini...


Neural Name Translation Improves Neural Machine Translation

In order to control computational complexity, neural machine translation (NMT) systems convert all rare words outside the vocabulary into a single unk symbol. A previous solution (Luong et al., 2015) resorts to using multiple numbered unks to learn the correspondence between source and target rare words. However, words unseen in the training corpus cannot be handled by this method at test time. And it a...



Journal

Journal title: Shizen gengo shori

Year: 2022

ISSN: 1340-7619, 2185-8314

DOI: https://doi.org/10.5715/jnlp.29.1082